UMass at TDT 2000
Abstract
Our research had two thrusts, neither of which was ready to be deployed in this evaluation. We report here on results from the training data, all obtained within the link detection task. In the first direction, we looked more carefully at score normalization across different languages and media types. We found that normalizing scores differently depending on the source language improved results noticeably, though not substantially. In the second direction, we smoothed the vocabulary of each story using a “query expansion” technique from Information Retrieval, which adds words from the corpus to the story. This resulted in substantial improvements.
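The abstract does not say how scores were normalized, so the following is only a minimal sketch of one plausible approach: standardizing link detection similarity scores separately within each source-language condition, so that a single decision threshold can be applied across conditions. The function name, the condition labels, and the choice of z-scoring are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_by_condition(scored_pairs):
    """Z-score raw similarity scores within each language condition.

    scored_pairs: list of (condition, raw_score) tuples, where condition is a
    hypothetical label such as "eng-eng" or "man-eng" for the source languages
    of the two stories being compared.
    Returns a list of (condition, normalized_score) in the same order.
    """
    # Group raw scores by condition.
    by_condition = defaultdict(list)
    for condition, score in scored_pairs:
        by_condition[condition].append(score)

    # Compute per-condition mean and standard deviation.
    stats = {}
    for condition, scores in by_condition.items():
        sd = pstdev(scores)
        # Guard against a zero deviation (all scores identical).
        stats[condition] = (mean(scores), sd if sd > 0 else 1.0)

    # Rescale each score relative to its own condition.
    return [
        (condition, (score - stats[condition][0]) / stats[condition][1])
        for condition, score in scored_pairs
    ]


pairs = [
    ("eng-eng", 0.2),
    ("eng-eng", 0.4),
    ("man-eng", 0.02),
    ("man-eng", 0.04),
]
normalized = normalize_by_condition(pairs)
```

In this toy input the cross-language scores are ten times smaller than the same-language scores, yet after normalization both conditions occupy the same range, which is the effect a per-language normalization is meant to achieve.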
Related resources
Track detection on the cells exposed to high LET heavy-ions by CR-39 plastic and terminal deoxynucleotidyl transferase (TdT)
Background: The fatal effect of ionizing radiation on cells depends on the Linear Energy Transfer (LET) level. The distribution of ionizing radiation is sparse and homogeneous for low LET radiation such as X-rays or γ-rays, but it is dense and concentrated for high LET radiation such as heavy-ion radiation. Material and Methods: Chinese hamster ovary cells (CHO-K1) were exposed to 4 Gy Fe-ion 2000 keV/...
Quality Control in Large Annotation Projects Involving Multiple Judges: The Case of the TDT Corpora
The Linguistic Data Consortium at the University of Pennsylvania has recently been engaged in the creation of large-scale annotated corpora of broadcast news materials in support of the ongoing Topic Detection and Tracking (TDT) research project. The TDT corpora were designed to support three basic research tasks: segmentation, topic detection, and topic tracking in newswire, television and rad...
UMass at TDT 2004
Topic Detection classifies stories into different topics, but HTD requires more than that. Are there any other entities between a story and a topic? [10] views a topic as a structure of inter-related events, which gives us a good hint for this new task. Experiments in [10] show that time locality is a very useful attribute in event organization, and it can also help to solve the complexity probl...
Multiple Annotations of Reusable Data Resources: Corpora for Topic Detection and Tracking
Responding to demands for very large, easily accessible, reusable news corpora to support research in the topic detection and tracking paradigm, the Linguistic Data Consortium created the TDT corpora. In addition to supporting research in the Topic Detection and Tracking program, the TDT corpora were collected and annotated with an eye toward reuse and re-annotation. Their value is confirmed in...
Results of the 1999 topic detection and tracking evaluation in Mandarin and English
The National Institute of Standards and Technology (NIST) administered the second open evaluation of Topic Detection and Tracking (TDT) technologies in 1999. The TDT project supports development of technologies that automatically organize event-related news stories. The program leverages expertise in core technologies, Automatic Speech Recognition (ASR), Document Retrieval (DR), and Machine Tra...
Publication date: 2000